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This  paper  describes  a  modular  software  package  for  solving  systems  of  nonlinear  equations 
and  nonlinear  least  squares  problems,  using  a  new  class  of  methods  called  tensor  methods.  It  is 
intended  for  small  to  medium-sized  problems,  say  with  up  to  100  equations  and  unknowns,  in 
cases  where  it  is  reasonable  to  calculate  the  Jacobian  matrix  or  approximate  it  by  finite  differ¬ 
ences  at  each  iteration.  The  software  allows  the  user  to  select  between  a  tensor  method  and  a 
standard  method  based  upon  a  linear  model.  The  tensor  method  models  F(x)  by  a  quadratic 
model,  where  the  second-order  term  is  chosen  so  that  the  model  is  hardly  more  expensive  to 
form,  store,  or  solve  than  the  standard  linear  model.  Moreover,  the  software  provides  two  dif¬ 
ferent  global  strategies,  a  bne  search  and  a  two-dimensional  trust  region  approach.  Test  results 
indicate  that,  in  general,  tensor  methods  are  significantly  more  efficient  and  robust  than  stan¬ 
dard  methods  on  small  and  medium-sized  problems  in  iterations  and  function  evaluations. 
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1.  Introduction 


This  paper  describes  a  modular  software  package  for  solving  systems  of  nonlinear  equations  of 
the  form 

F  :  Rn  —>  Rm ,  m  >  ?i,  (1.1) 

where  F  is  assumed  to  be  at  least  once  continuously  differentiable,  using  a  new  class  of  methods 
called  tensor  methods.  If  m  is  equal  to  n,  the  package  solves  the  nonlinear  equations  prob¬ 
lem,  F(x)  =  0,  while  if  m  is  greater  than  n  it  solves  the  nonlinear  least  squares  problem, 
min xeRn  |[  F(x)  |[2. 

Tensor  methods  base  each  iteration  on  a  quadratic  model  of  the  nonlinear  function, 

M{xc  +  d)  =  F(xc)  +  F'{xc)d  +  ^  Tcdd ,  (1.2) 

where  xc  is  the  current  iterate,  and  Tc  is  a  three-dimensional  object  referred  to  as  a  tensor.  No 
second  derivative  information  is  used  in  forming  the  tensor  term  Tc.  Instead,  Tc  is  formed  by 
asking  the  model  to  interpolate  up  to  y/n  past  function  values  in  a  way  that  hardly  increases  the 
storage  requirements  or  arithmetic  cost  per  iteration  over  standard  linear  model  based  methods. 
The  package  also  provides  an  option  to  use  a  method  based  on  the  standard  linear  model  ((1.2) 
without  the  tensor  term);  it  then  performs  a  standard  Newton  method  for  nonlinear  equations 
or  Gauss-Newton  method  for  nonlinear  least  squares.  The  global  strategy  used  in  either  case 
can  either  be  a  line  search  strategy  or  a  two-dimensional  trust  region  method  over  the  subspace 
spanned  by  the  steepest  descent  direction  and  the  tensor  (or  Newton/Gauss-Newton)  step. 

Required  input  to  the  package  includes  the  dimensions  m  and  n  of  the  problem,  where  m  is 
the  number  of  nonlinear  equations  and  n  is  the  number  of  unknowns;  a  subroutine  to  evaluate 
the  function  F(x);  and  an  estimate  x0  of  the  solution  x *.  The  user  may  provide  a  code  to 
calculate  the  Jacobian  rather  than  having  it  computed  by  finite  differences,  may  choose  the 
standard  method  rather  than  the  tensor  method,  and  may  specify  various  tolerances. 

Upon  completion,  the  program  returns  with  an  approximation  xp  to  the  solution  x*,  the  value 
of  the  sum  of  squares  of  the  function  F(xp ),  the  value  of  the  gradient  G(xp)  =  Ff(xp)T  F(xp) 
of  the  function  |||F(x)||22  at  xp,  and  a  flag  specifying  under  which  stopping  condition  the 
algorithm  was  terminated. 

The  tensor  methods  upon  which  this  software  package  is  based  were  originally  introduced 
by  Schnabel  and  Frank  [11],  for  nonlinear  equations.  One  main  contribution  of  this  paper  is  the 
provision  and  extensive  testing  of  a  software  package  incorporating  these  methods.  In  addition, 
the  extension  of  these  methods  to  nonlinear  least  squares,  and  the  incorporation  of  a  trust  region 
strategy  with  tensor  methods,  are  new  contributions  of  this  paper. 

The  remainder  of  this  paper  is  organized  as  follows.  In  Section  2  we  give  a  brief  overview 
of  tensor  methods  for  nonlinear  least  squares  problems  (tensor  methods  for  nonlinear  equations 
can  be  regarded  as  a  special  case  of  these).  In  Section  3  we  discuss  the  globally  convergent 
modifications  for  tensor  methods  for  systems  of  nonlinear  equations  and  nonlinear  least  squares 
problems.  Section  4  gives  an  overview  of  the  key  features  and  options  provided  by  the  software 
package.  We  then  describe  the  user  interface  to  the  package  in  Section  5,  which  includes  both 
a  simplified  default  calling  sequence  and  a  longer  calling  sequence.  In  Section  6  we  describe  the 
meaning  of  the  input,  input-output,  and  output  parameters  for  the  package.  Section  7  presents 
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the  default  values  provided  by  the  package.  A  few  implementation  dependencies  are  described 
in  Section  8.  Section  9  gives  an  example  of  the  use  of  the  package.  Finally,  in  Section  10  we 
summarize  and  discuss  our  experimental  results  using  the  package,  with  both  line  search  and 
trust  region  strategies,  on  nonsingular  and  singular  test  problems. 

2.  Brief  Overview  of  Tensor  Methods 


Tensor  methods  are  general-purpose  methods  intended  especially  for  problems  where  the  Jaco¬ 
bian  matrix  at  the  solution  is  singular  or  ill-conditioned.  The  idea  is  to  base  each  iteration  upon 
a  model  that  has  more  information  than  the  standard  linear  model  but  is  not  appreciably  more 
expensive  to  form,  store,  or  solve.  Specifically,  each  iteration  is  based  upon  a  quadratic  model 
(1.2)  of  the  nonlinear  function  F(x).  The  particular  choice  of  the  tensor  term  Tc  £  Rmxnxn 
causes  the  second-order  term  Tcdd  in  (1.2)  to  have  a  simple  and  useful  form.  The  tensor  term  is 
chosen  to  allow  the  model  M(xc  +  d)  to  interpolate  values  of  the  function  F(x)  at  past  iterates 
X-k]  that  is,  the  model  should  satisfy 


where 


F(x-k)  =  F(xc)  +  F'(xc)s) k  +  -Tcsksk,  k  =  1,  ...,  p, 


Sk  % —  k  & Cl  h  —  1?  •••}  P' 


(2.1) 


The  past  points  ar_i,  ...,  x-p  are  selected  so  that  the  set  of  directions  { sk }  from  xc  to  the 
selected  points  is  strongly  linearly  independent;  each  direction  sk  is  required  to  make  an  angle 
of  at  least  45  degrees  with  the  subspace  spanned  by  the  previously  selected  past  directions.  The 
procedure  of  finding  linearly  independent  directions  is  implemented  easily  by  using  a  modified 
Gram-Schmidt  algorithm,  and  usually  results  in  p  =  1  or  2. 

After  selecting  the  linearly  independent  past  directions  sk ,  the  tensor  term  is  chosen  by  the 
procedure  of  Schnabel  and  Frank  [11],  which  generalizes  in  a  straightforward  way  to  nonlinear 
least  squares.  Tc  is  chosen  to  be  the  smallest  matrix  that  satisfies  the  interpolation  conditions 
(2.1);  that  is, 


min 

TceRmxnxn 


II  Tc  ||F 


(2.2) 


subject  to  Tcsksk  =  2  (F(x_k)  -  F{xc)  -  F\xc)sk), 


where  ||Tc||ir,  the  Frobenius  norm  of  Xc,  is  defined  by 


II  Tc  !1f2  =  X  X  X 

2—1  j= 1  A:=l 


(2.3) 


The  solution  to  (2.2)  is  the  sum  of  p  rank-one  tensors  whose  horizontal  faces  are  symmetric: 

p 

Tc  —  ^  ^  Q'k’SkSk  i  (2.4) 

k- 1 

where  ak  is  the  k- th  column  of  A  £  Rmxp ,  A  defined  by  A  =  ZM -1,  Z  is  an  (m  x  p)  matrix 
whose  columns  are  Zj  =  2  (F(x_j)  -  F(xc)  -  F'(xc).Sj ),  and  M  is  a  (p  x  p)  matrix  defined 
by  =  ( SiTSj )2,  1  <  i,  j  <  p. 
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If  we  use  the  tensor  term  (2.4),  the  tensor  model  (1.2)  becomes 

M(xc  +  d)  -  F{xc)  +  F\xc)d  +  ^  JZ  ak  {dTSk}2-  (2.5) 

2  k=i 

The  simple  form  of  the  quadratic  term  in  (2.5)  is  the  key  to  being  able  to  efficiently  form, 
store,  and  solve  the  tensor  model.  The  cost  of  forming  the  tensor  term  in  the  tensor  model  is 
O(mnp)  <  0(mn 1-5)  arithmetic  operations,  since  p  <  y/n,  which  is  small  in  comparison  with 
the  0(mn 2)  cost  per  iteration  of  Gauss-Newton  methods.  The  additional  storage  required  is  4 p 
m- vectors,  which  is  small  in  comparison  with  the  storage  for  the  Jacobian  matrix. 

Once  the  tensor  model  (2.5)  is  formed,  a  root  of  the  tensor  model  is  found.  It  is  possible 
that  no  root  exists;  in  this  case  a  least  squares  solution  of  the  model  is  found  instead.  Thus,  in 
general,  we  solve  the  problem 

find  cl  G  Rn  that  minimizes  ||  M(xc  +  cl)  ||2.  (2-6) 

A  generalization  of  the  process  in  Schnabel  and  Frank  [11]  shows  that  the  solution  to  (2.6) 
can  be  reduced  to  the  solution  of  a  small  number  of  quadratic  equations,  m  —  n  -fi  q  quadratic 
equations  in  p  unknowns,  plus  the  solution  of  n  -  q  linear  equations  in  n  —  p  unknowns.  Here  q 
is  equal  to  p  whenever  F\xc)  is  nonsingular  and  usually  when  rank(T1/(a:c))  >  n  -  p;  otherwise, 
q  is  greater  than  p.  Thus  the  system  of  linear  equations  is  square  or  underdetermined,  and  the 
system  of  quadratic  equations  is  equally  determined  or  overdetermined.  The  main  steps  of  the 
algorithm  are  the  following: 

1.  An  orthogonal  transformation  of  the  variable  space  is  used  to  cause  the  m  equations  in  n 
unknowns  to  be  linear  in  n  —  p  variables,  d\  G  Rn~p ,  and  quadratic  only  in  the  remaining 
p  variables,  d2  G  Rp. 

2.  An  orthogonal  transformation  of  the  equations  is  used  to  eliminate  the  n  —  p  transformed 
linear  variables  from  n  —  q  of  the  equations.  The  result  is  a  system  of  m  —  n  -f  q  quadratic 
equations  in  the  p  unknowns,  d2,  plus  a  system  of  n  -  q  equations  in  all  the  variables  that 
is  linear  in  the  n  —  p  unknowns,  d\. 

3.  A  nonlinear  unconstrained  optimization  software  package,  UNCMIN  [12],  is  used  to  minimize 
the  l2  norm  of  the  m  —  n  +  q  quadratic  equations  in  the  p  unknowns,  d2.  (If  p  —  1,  this 
procedure  is  done  analytically  instead.) 

4.  The  system  of  n  —  q  linear  equations  that  is  linear  in  the  remaining  n  —  p  unknowns  is 
solved  for  cl\. 

The  arithmetic  cost  per  iteration  of  the  above  process  is  the  standard  0(mn2)  cost  of  a  QR 
factorization  of  an  m  x  n  matrix,  plus  an  additional  O(mnp)  <  0(mn1’5)  operations,  plus 
the  cost  of  using  UNCMIN  in  Step  3  of  the  algorithm.  The  cost  of  using  UNCMIN  is  expected 
to  be  0(p4)  <  0(n2)  operations,  since  each  iteration  requires  0(p3)  (0(p2q)  when  q  >  p) 

operations  and  a  small  multiple  of  p  iterations  generally  suffice.  Thus,  the  total  cost  of  the 
above  algorithm  is  the  0(mn2)  cost  of  the  standard  method  plus  at  most  an  additional  cost  of 
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0(mn1,5)  arithmetic  operations.  Note  that  in  the  case  when  p  =  1  and  q  >  1,  the  one-variable 
minimization  problem  is  solved  very  inexpensively  in  closed  form;  this  turns  out  to  be  the  most 
common  case  in  practice. 

The  Newton  or  Gauss-Newton  step  is  computed  inexpensively  (in  O(mnp)  operations)  as  a 
by-product  of  the  tensor  step  solution.  Using  the  tensor  step  and  the  Newton  or  Gauss-Newton 
step,  the  global  portion  of  the  algorithm  determines  the  next  iterate,  as  is  described  in  the  next 
section.  The  overall  algorithm  is  summarized  below. 

Algorithm  2.1.  An  Iteration  of  the  Tensor  Method 
Given  m,  n,  xc,  F(xc) 

1.  Calculate  F/(xc)}  and  decide  whether  to  stop,  if  not: 

2.  Select  the  past  points  to  use  in  the  tensor  model  from  among  the 
y/n  most  recent  points. 

3.  Calculate  the  second-order  term  of  the  tensor  model,  Tc,  so  that 

the  tensor  model  interpolates  F(x)  at  all  the  points  selected  in  Step  2. 

4.  Find  the  root  of  the  tensor  model,  or  its  minimizer  (in  the  /2  norm) 
if  it  has  no  real  root . 

5.  If  m  >  n  or  GLOBAL  =  two-dimensional  trust  region  then 

5.1.  Compute  the  standard  step  as  a  by-product  of  the  tensor  model  solution. 

5.2.  Select  the  tensor  or  standard  step  using  Algorithm  3.1. 

6.  Select  x+  using  either  a  line  search  or  a  two-dimensional  trust  region 
global  strategy. 

6.1.  If  GLOBAL  =  line  search  then 

If  m  >  n  perform  Algorithm  3.3,  where  the  search  direction  is 
the  step  selected  in  Step  5.2 
Else  {m  =  n}  perform  Algorithm  3.2. 

Elself  GLOBAL  =  two-dimensional  trust  region  then 

Perform  Algorithm  3.4  using  the  model  selected  in  Step  5.2. 

7.  Set  xc  <—  x+ ,  F(xc)  F(x+),  go  to  Step  1. 

The  reader  may  refer  to  [1],  [2],  [6],  and  [11]  for  more  details  on  tensor  methods  for  nonlinear 
equations  and  nonlinear  least  squares  problems.  These  papers  give  preliminary  indications  that 
tensor  methods  are  more  efficient  and  more  robust  computationally  than  standard  methods,  and 
show  that  tensor  methods  have  a  superior  rate  of  convergence  to  Newton’s  method  on  nonlinear 
equations  problems  where  rankjjF^r*)}  —  n  —  1. 

3.  Globally  Convergent  Modifications  for  Tensor  Methods 

This  section  describes  the  global  strategies  in  the  tensor  algorithm  given  above.  As  with  all 
algorithms  for  nonlinear  equations  and  optimization,  purely  local  tensor  methods  may  fail  to 
converge  if  the  initial  guess  is  far  away  from  the  solution.  To  address  this  problem,  two  types  of 
modifications  are  used  in  general,  line  search  methods  and  trust  region  methods,  and  either  may 
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be  best  for  a  particular  problem.  For  this  reason,  both  of  these  global  methods  are  included  in 
our  software  package. 

This  section  first  describes  the  overall  framework  that  is  used  in  both  the  line  search  and 
trust  region  approaches  for  tensor  methods.  This  framework  involves  a  choice  of  whether  to 
use  the  tensor  step  or  the  Newton/Gauss-Newton  step  as  the  basis  for  the  global  strategy  at  a 
given  iteration.  Next  we  briefly  describe  the  line  search  that  is  used  in  the  line  search  methods. 
Finally,  we  describe  a  new  model  trust  region  approach  for  tensor  methods  that  is  used  in  the 
trust  region  methods. 

3.1.  Globally  Convergent  Framework  for  Tensor  Methods  for  Nonlinear  Least  Squares 

Our  computational  experience  has  shown  that  when  one  is  far  from  the  solution,  it  is  important 
to  sometimes  allow  the  global  step  to  be  based  upon  the  Newton/Gauss-Newton  step  rather  than 
the  tensor  step,  and  we  have  constructed  heuristics  to  make  this  choice.  Our  experimentation 
has  led  to  two  different  sets  of  heuristics,  one  that  is  used  in  both  the  line  search  and  trust  region 
methods  for  nonlinear  least  squares  as  well  as  the  trust  region  method  for  nonlinear  equations, 
and  a  second  that  is  used  in  line  search  methods  for  nonlinear  equations.  They  differ  primarily 
in  how  much  they  bias  the  choice  toward  the  tensor  step.  Both  are  constructed  so  that  close  to 
the  solution,  the  tensor  step  is  nearly  always  selected.  This  section  gives  these  heuristics  and 
the  overall  global  frameworks  that  are  based  upon  them. 

Algorithm  3.1  gives  the  global  framework  that  is  used  for  nonlinear  least  squares  and  for 
trust  region  methods  for  nonlinear  equations.  In  this  framework,  the  Gauss-Newton  step  is 
chosen  whenever  the  tensor  step  is  not  a  descent  direction,  when  the  tensor  step  is  a  minimizer 
of  the  tensor  model  and  does  not  provide  enough  decrease  in  the  tensor  model,  or  when  the 
quadratic  system  of  m  -  n  +  q  equations  in  p  unknowns  cannot  be  solved  by  UNCMIN  [12]  within 
the  iteration  limit.  Otherwise,  the  tensor  step  is  chosen.  In  the  definitions  of  dt  and  Mj ,  the 
Newton  step  and  model  are  used  for  nonlinear  equations,  the  Gauss-Newton  step  and  model  are 
used  for  nonlinear  least  squares. 

Algorithm  3.1.  Global  Framework  for  Tensor  Methods  for  Nonlinear  Least 

Squares  and  for  Trust  Region  Methods  for  Nonlinear  Equations 

Let 

xc  =  current  iterate, 

J(xc )  =  approximation  to  F'(xc), 

g  =  J(xc)T F(xc) ,  the  gradient  of  |  F(x)TF(x)  at  xc, 
dt  =  minimizer  of  the  tensor  model, 

dn  =  Newton  or  Gauss-Newton  step:  - J(xc)~l F(xc)  or  -(J(xc)T J(xc))~lJ(xc)T F(xc) 
if  J(xc )  is  sufficiently  well-conditioned, 

Levenberg-Marquardt  step  ~{J(xc)T J(xc)  +  gl)~l J{xc)T F(xc)  otherwise, 
where  //  =  \J  n  e  ||  J{xc )  ||i  ||  J(xc )  Hoc,  e  =  machine  epsilon, 

Mt  =  tensor  model, 

Myv  =  Newton  or  Gauss-Newton  model. 

IF  (  no  root  or  minimizer  of  the  tensor  model  was  found  )  OR 

((  minimizer  of  the  tensor  model  that  is  not  a  root  was  found  )  AND 

(  ||  MT  (xc  +  dt)  ||2  >  §  (  j|  F(xc)  ||2  +  ||  Mn(xc  +  dn)  |b  )))  OR 
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(  gTdt  > 

THEN 

*+  *“ 
x+  <- 

ELSE 

x+  <— 
x+  — 

ENDIF 


-IQ'4  ||  g  ||2  II  dt  ||2  )) 

xc  +  A  dn,  A  £  (0,1]  selected  by  line  search,  or 

xc  +  a  dn  —  (3  g,  a,  f3  selected  by  trust  region  algorithm 

xc  +  A  dt,  A  6  (0,1]  selected  by  line  search,  or 

xc  +  ol  dt  —  (3  g,  a,  ft  selected  by  trust  region  algorithm 


Algorithm  3.2  gives  the  global  framework  that  is  used  in  line  search  methods  for  nonlinear 
equations.  Its  main  difference  from  Algorithm  3.1  is  that  it  always  tries  the  tensor  step  first, 
whether  or  not  this  step  meets  the  descent  or  model  decrease  conditions  of  Algorithm  3.1.  If 
xc  +  dt  provides  enough  decrease  in  ||F(z)||,  then  it  is  used  as  the  next  iterate.  If  not,  the 
strategy  may  tentatively  compute  global  steps  in  both  the  Newton  and  the  tensor  directions. 
That  is,  the  global  step  =  xc  +  A dn  produced  by  a  line  search  in  the  Newton  direction  dn 
is  calculated.  In  addition,  if  dt  is  a  descent  direction,  the  global  step  x\  =  xc  +  A dt  produced 
by  a  line  search  in  the  tensor  direction  also  is  calculated.  Finally,  we  select  x T  or  x\  depending 
on  whichever  has  the  lower  function  value.  Thus,  this  strategy  may  involve  one  or  more  extra 
function  evaluations  when  both- line  searches  are  performed. 


Algorithm  3.2.  Global  Framework  for  Line  Search  Methods  for  Nonlinear  Equations 
Given  xc,  dni  dt ,  g  as  defined  in  Algorithm  3.1,  and  a  =  10~4. 

slope  :=  gTdt 
Sc  :=  |!|  F(xc)  ||22 

x . —  Xq  -j~  dt 

f+  ■=  III  F(x‘+)  ||22 

If  /+  <  fc  +  a  ■  min{5/ope,0}  then 
return  =  x*+ 

Else 

Find  an  acceptable  rj  in  the  Newton  direction  dn , 
using  Algorithm  3.3 

comment.  Test  if  the  tensor  step  is  sufficiently  descent 
If  gTdt  >  -  10-4  ||  g  ||2  ||  d,  ||2  then 
return  x+  =  x7^ 

Else 

Find  an  acceptable  xl+  in  the  tensor  direction  dti 
using  Algorithm  3.3 
If  ||  )  ||  <  ||  F(x‘+)  ||  then 

return  x+  =  x+ 

Else 

return  r+  =  xl+ 

Endif 


Endif 


Endif 


3.2.  Global  Framework  for  Line  Search  Methods  for  Nonlinear  Equations 

The  line  search  used  in  the  global  frameworks  outlined  above  is  a  standard  quadratic  backtrack¬ 
ing  line  search.  It  starts  with  A  =  1  and  then,  if  xc  +  d  is  not  acceptable,  reduces  A  until 
an  acceptable  xc  +  A d  is  found,  based  upon  a  one-dimensional  quadratic  model  of  F(x)T  F(x). 
Let  us  define 

/(A)  =  |||  F(xc  +  A d)  ||22, 

the  one-dimensional  restriction  of  f(x)  =  |||  F{x)  ||22  to  the  hne  through  zc  in  the  direction 
d.  If  we  need  to  backtrack,  we  use  the  values  of  /( 0),  /'(0),  and  /(A)  to  model  f  and  then 
take  the  value  of  A  that  minimizes  this  model  as  the  next  value  of  A  in  Algorithm  3.3  subject  to 
restrictions  on  how  much  A  can  decrease  at  once  (see,  e.g.,  [4],  pages  126-127  for  more  details). 
This  results  in  the  following  algorithm. 

Algorithm  3.3.  Standard  Quadratic  Backtracking  Line  Search 
Given  xc,  d,  g  =  J(xc)T  F(xc) ,  and  a  =  10~4 . 
slope  :=  gTd 
fc  :=  III  F(xc)  ||22 
A  :=  1.0 

Xp  ;=  xc  T  A  d 

fp  ■=  ill  F(x„)  ib2 

While  fp  >  fc  +  a  •  A  •  slope  do 

temp  :=  —A  •  slope  /(2[/p  -  fc  -  A  •  slope]) 

A  . — max{A<emp,  A/10} 

xp  :=  xc  +  A  d 
fp  :=  ill  F(xp)  ||22 

EndWhile 

3.3.  Trust  Region  Tensor  Methods  for  Nonlinear  Equations  and  Nonlinear  Least 
Squares 

Two  computational  methods — the  locally  constrained  optimal  (or  “hook”)  method  and  the  dog¬ 
leg  method— are  generally  used  for  approximately  solving  the  trust  region  problem  based  on  the 
standard  model, 

minimize  ||  F(xc)  +  J(xc)d  ||22  (3.1) 

subject  to  )|  d  ||2  <  6C, 

where  6C  is  the  current  trust  region  radius.  When  Sc  is  shorter  than  the  standard  step,  the 
locally  constrained  optimal  method  [8]  finds  a  gc  such  that  ||  d{gc)  ||2  «  6C,  where  d(/ic)  = 
~ (J {x C)T J (x c)  +  gI)~lJ(xc)TF{xc).  Then  it  takes  x+  =  xc  +  d(nc).  The  dogleg  method  is 
a  modification  of  the  trust  region  algorithm  introduced  by  Powell  [10].  Rather  than  finding  a 
point  x+  =  xc  +  d(gc)  on  the  curve  d(gc)  such  that  ||  x+  -  xc  ||  %  6C,  it  approximates  this  curve 
by  a  piecewise  linear  function  in  the  subspace  spanned  by  the  Newton  direction  and  the  steepest 
descent  direction  -.J(xc)T F(xc),  and  takes  x+  as  the  point  on  this  piecewise  curve  for  which 
||  x+  —  xc  ||  =  6C.  (See,  e.g.,  [4]  for  more  details.) 


Unfortunately,  these  two  methods  are  difficult  to  extend  to  the  tensor  model,  because  certain 
key  properties  do  not  generalize  to  this  model.  Trust  region  algorithms  based  on  (3.1)  are  well 
defined  because  there  is  always  a  unique  point  on  the  hookstep  or  dogleg  curve  such  that 
IK/JC)||  ~  Additionally,  the  value  of  ||  F(xc)  +  J(xc)d  W22  along  these  curves  decreases 
monotonically  from  xc  to  .rj,  where  x'J  =  xc  +  dn,  which  makes  the  process  reasonable.  These 
properties  do  not  extend  to  the  fourth-order  sum  of  squares  of  the  tensor  model,  which  may  not 
be  convex.  Furthermore,  the  analogous  curve  to  d(gic)  is  more  expensive  to  compute.  For  these 
reasons,  we  consider  a  different  trust  region  approach  for  our  tensor  methods. 

The  trust  region  approach  that  is  used  in  this  package  is  to  solve  a  two-dimensional  trust 
region  problem  over  the  subspace  spanned  by  the  steepest  descent  direction  and  the  tensor  (or 
standard)  step.  The  main  reasons  that  led  us  to  adopt  this  approach  are  that  it  is  easy  to 
construct  and  is  closely  related  to  dogleg-type  algorithms  over  the  same  subspace.  In  addition, 
the  resultant  step  may  be  close  to  the  optimal  trust  region  step  in  practice.  Byrd,  Schnabel,  and 
Shultz  [3]  have  shown  that  for  unconstrained  optimization  using  a  standard  quadratic  model, 
the  analogous  two-dimensional  minimization  approach  produces  nearly  as  much  decrease  in  the 
quadratic  model  as  the  optimal  trust  region  step  in  almost  all  cases. 

The  two-dimensional  trust  region  approach  for  the  tensor  model  computes  an  approximate 
solution  to  the  exact  trust  region  problem 


minimize  ||  F(xc )  +  J(xc)d  -f  ;  E  o-k  { dTSk }2  II2 


fc=i 


subject  to  ||  d  [[2  <  6C, 
by  performing  a  two-dimensional  minimization 


(3.2) 


1  - 

minimize  ||  F(xc)  +  J(xc)d  +  r  E  {dT  *k) 


2  lb2 


subject  to  ||  d  H2  <  Sc 


k= 1 


(3.3) 


where  dt  and  gs  are  the  tensor  step  and  the  steepest  descent  direction,  respectively,  and  Sc  is  the 
trust  region  radius.  This  approach  always  produces  a  step  that  reduces  the  quadratic  model  by 
at  least  as  much  as  a  dogleg- type  algorithm,  which  minimizes  the  model  over  a  piecewise  linear 
curve  in  the  same  subspace.  When  Algorithm  3.1  chooses  the  Newton  or  Gauss-Newton  step, 
we  instead  solve  the  variant  of  (3.3)  where  dt  is  replaced  by  dn  and  the  quadratic  term  in  the 
model  is  omitted. 

Before  we  give  the  complete  two-dimensional  trust  region  algorithm  for  tensor  methods, 
we  show  how  to  convert  the  problem  (3.3)  into  an  unconstrained  minimization  problem  in  one 
variable.  This  transformation  is  the  key  to  solving  (3.3)  efficiently.  First,  we  form  an  orthonormal 
basis  for  the  two-dimensional  subspace  by  performing  the  projection 


dt 


and  normalizing  gs  and  dt  to  obtain 


(3.4) 


dt  = 


9s  = 


9s 

9s  II2  ’ 


(3.5) 
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Since  d  is  in  the  subspace  spanned  by  dt  and  gs,  it  can  be  written  as 

d  =  adt  +  (3g3,  a,(3  €  (3.6) 

If  we  square  the  /2  norm  of  this  expression  for  d  and  set  it  to  we  obtain  the  following  equation 
for  (5  as  a  function  of  a: 

&  =  d6c  -  «2- 

Substituting  this  expression  for  (3  into  (3.6)  and  then  the  resulting  d  into  (3.3)  yields  the  global 
minimization  problem  in  the  one  variable  a, 

minimize  |(  F(xc)  +  aJ(xe)d,  +  -  a2J<xc)gs  +  §  i  akias^d,  +  \ZW^oFslgs)2)\\22, 

(3.7) 

where  —  Sc  <  a  <  Sc.  Thus,  problems  (3.7)  and  (3.3)  are  equivalent. 

We  use  the  same  procedure  to  convert  the  problem 

minimize  ||  F(xc)  +  J(xc)d  W22  (3.8) 

subject  to  ||  d  H2  <  Sc,  d  G  [  dn,g  ] 
to  the  equivalent  global  minimization  problem  in  the  one  variable  a, 

minimize 1 1  F(xc)  +  aJ(xc)dn  +  y/ 6*  -  a2  J(xc)gs  ||22,  (3.9) 

where  —  Sc  <  a  <  6C. 

The  two-dimensional  trust  region  method  for  tensor  methods  is  given  in  the  following  algo¬ 
rithm. 


Algorithm  3.4.  Two-Dimensional  Trust  Region  for  Tensor  Methods 

Given  xc,  dn,  dt  as  defined  in  Algorithm  3.1. 

Let  gs  =  —J(xc)T  F(xc),  the  steepest  descent  direction; 

6C  the  current  trust  region  radius; 
dt  and  gs  given  by  (3.5); 
dn  obtained  in  an  analogous  way  to  dt 

by  applying  transformations  (3.4)  and  (3.5)  to  dn . 

If  tensor  model  selected  then 
Solve  problem  (3.7) 
d  =  a*  dt  +  gs  \Jb2c  -  al 

where  a*  is  the  global  minimizer  of  (3.7) 

Else  {  standard  model  selected  } 

Solve  problem  (3.9) 
d  =  a*  dn  +  gs  y/F*  -  a\ 

where  a*  is  the  global  minimizer  of  (3.9) 
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Endif 

{Check  new  iterate  and  update  trust  region  radius} 


x+  =  x 

ill 
2  1 1 


If 


;  4*  d 

F(X+)  ||2: 


F(XC)  ||; 


pred 


the  global  step  d  is  successful 
Else 

decrease  trust  region 
go  to  Step  1 
Endif 
where 

pred  =  |||  F(xc)  +  J(xc)d  + 

tensor  model  selected, 
pred  =  |||  F(xc)  +  J(xc)d  ||22 

standard  model  selected. 


>  10  4  then 


ELi  MdTsky2  ||22 

|||  F{xc)  ||22,  if 


ill  F{xc) 


lb2. 


if 


The  methods  used  for  adjusting  the  trust  radius  during  and  between  steps  are  given  in 
Algorithm  A6.4.5  [9,  p.  338].  The  initial  trust  radius  can  be  supplied  by  the  user;  if  not,  it  is  set 
to  the  length  of  the  initial  Cauchy  step.  Our  software  solves  the  one  variable  global  optimization 
problem  by  a  straightforward  partitioning  scheme  described  in  [2]. 


4.  Overview  of  the  Software  Package 

This  section  summarizes  the  key  features  of  the  software  package. 

The  user  has  the  option  to  solve  systems  of  nonlinear  equations  or  nonlinear  least  squares 
problems.  In  either  case,  the  required  input  for  the  software  is  the  number  of  equations  M,  the 
number  of  variables  N,  the  function  FVEC  that  computes  F(x),  and  an  initial  guess  X0.  If  M  = 
N,  the  problem  is  nonlinear  equations;  if  M  >  N  it  is  nonlinear  least  squares.  The  user  does  not 
have  to  set  a  flag  differentiating  between  the  two  problems. 

Two  methods  of  calling  the  package  are  provided.  In  the  short  version,  the  user  supplies 
only  the  above  information,  and  default  values  of  all  other  options  and  parameters  are  used. 
(These  include  the  use  of  the  tensor  rather  than  the  standard  method,  the  use  of  the  line  search 
global  strategy,  and  the  calculation  of  the  Jacobian  by  finite  differences).  In  the  other  method 
for  calling  the  package,  the  user  may  override  any  default  values  of  the  package  options  and 
parameters. 

The  package  allows  the  user  to  use  the  tensor  method  or  the  standard  Newton  or  Gauss- 
Newton  method.  METHOD  =  1  specifies  the  tensor  method  and  is  the  default  value.  If  the  flag 
METHOD  is  set  to  0,  the  package  will  use  the  standard  method. 

Two  global  strategies  are  implemented  in  the  software  package,  a  line  search  method,  and  a 
two-dimensional  trust  region  method  over  the  subspace  spanned  by  the  steepest  descent  direction 
and  the  tensor  (or  Newton/Gauss-Newton)  step.  The  global  strategy  may  be  specified  using  the 
parameter  GLOBAL.  GLOBAL  =  0  is  the  default  and  specifies  the  line  search.  GLOBAL  =  1  specifies 
the  trust  region. 

The  user  may  supply  an  analytic  routine  to  evaluate  the  Jacobian  matrix.  If  it  is  not 
supplied,  the  package  computes  the  Jacobian  by  finite  differences.  The  finite  difference  routine 
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is  described  in  detail  by  Dennis  and  Schnabel  [4].  The  parameter  JACFLG  specifies  whether  an 
analytic  Jacobian  has  been  provided.  The  default  value,  which  specifies  finite  differences,  is 
JACFLG  =  0.  When  the  analytic  Jacobian  is  supplied,  the  user  has  the  option  of  checking  the 
supplied  analytic  routine  against  the  package’s  finite  difference  routine;  if  MSG  is  set  to  2  modulo 
4,  the  package  will  not  check  the  analytic  Jacobian  against  the  finite  difference  one;  otherwise 
it  will. 

Scaling  information  for  the  variables  and/or  the  functions  may  be  supplied  by  the  user. 
The  software  package  is  coded  so  that  if  the  user  inputs  the  typical  magnitude  typxi  of  each 
component  of  x  and/or  the  typical  magnitude  typ ft  of  each  component  of  the  function  F,  the 
performance  of  the  package  is  equivalent  to  what  would  result  from  redefining  the  independent 
variable  x  in  the  user’s  function  and  the  components  of  the  function  F  with 


l/typx1 


xscaled 


(4.1) 


and/or 


l/typxn 


i/typfi 


Fscaled  = 


(4.2) 


L  1  /typfm  J 

respectively,  and  running  the  package  without  scaling.  The  default  value  of  each  typxi  and  typfi 
is  1  (i.e.,  no  scaling).  Scaling  is  often  important  to  use  for  problems  in  which  there  is  great 
variation  in  the  magnitudes  of  individual  variables  and/or  function  components. 

The  package  includes  a  module  INCHK  that  examines  the  input  parameters  for  illegal  entries 
and  consistency.  Certain  illegal  or  inconsistent  entries  are  reset  to  default  values  by  this  module, 
while  other  illegal  entries  cause  the  package  to  terminate.  Details  are  given  in  the  parameter 
listing  in  Section  6. 

The  standard  (default)  output  from  this  package  consists  of  printing  the  input  parameters, 
the  final  results,  and  the  stopping  condition.  The  printed  input  parameters  are  those  used  by 
the  algorithm  and  hence  include  any  corrections  made  by  the  module  INCHK.  The  program  will 
provide  an  error  message  if  it  terminates  as  a  result  of  input  errors.  The  printed  results  include  a 
message  indicating  the  reason  for  termination,  an  approximation  xv  to  the  solution  z*,  the  value 
of  the  sum  of  squares  of  the  function  F(xp),  and  the  gradient  vector  G(xp)  =  F'(xp)T F(xp)  of 
the  function  |||ir(x)||22  at  xp.  The  package  provides  an  additional  means  for  controlling  the 
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output  by  means  of  the  variable  MSG,  described  in  Section  6.  The  user  may  suppress  all  output 
or  may  print  the  intermediate  iterations  results  in  addition  to  the  standard  output. 

5.  Interfaces  and  Usage 

Two  interfaces  are  provided  with  the  system.  NONLQO  requires  the  user  to  provide  only  the 
dimensions  M  and  N  of  the  problem,  a  subroutine  to  evaluate  the  function  F,  and  a  starting 
vector  X0  (as  well  as  three  work  arrays  and  their  dimensions).  NONLQ  requires  the  user  to  supply 
all  parameters.  However,  the  user  may  specify  selected  parameters  only  by  first  invoking  the 
subroutine  DFAULT,  which  sets  all  parameters  to  their  default  values,  and  then  overriding  only 
the  desired  values.  This  is  the  normal  usage  of  NONLQ. 

The  two  calling  sequences  are  as  follows. 

1.  CALL  NONLQO (NRM,  NRN ,  NC,  XO,  M,  N,  WRKUNC,  WRKNEM ,  WRKNEN ,  IWRKN , 

FVEC ,  MSG,  XP,  FP,  GP,  TERMCD) 

2.  CALL  DFAULT (M,  N,  ITNLIM ,  JACFLG,  GRADTL,  STEPTL ,  FTOL , 

METHOD,  GLOBAL,  STEPMX,  DLT,  TYPX,  TYPF ,  IPR,  MSG) 

C  USER  OVERRIDES  SPECIFIC  DEFAULT  VALUES  PARAMETERS,  E.G. 

GRADTL  =  1.0D-6 

STEPTL  =  1.0D-7 

FTOL  =  1.0D-10 

JACFLG  =  1 

CALL  NONLQ (NRM,  NRN,  NC,  XO ,  M,  N,  TYPX,  TYPF,  ITNLIM,  JACFLG, 

GRADTL,  STEPTL,  FTOL,  METHOD,  GLOBAL,  STEPMX,  DLT,  IPR, 

WRKUNC,  WRKNEM,  WRKNEN,  IWRKN,  ANJA,  FVEC,  MSG,  XP,  FP ,  GP ,  TERMCD) 

6.  Parameters  and  Default  Values 

The  parameters  employed  with  the  calling  sequences  of  Section  5  are  fully  described  here.  NONLQO 
uses  only  those  parameters  that  are  preceded  by  an  asterisk.  When  it  is  noted  that  module 
DFAULT  returns  a  given  value,  this  is  the  value  employed  by  interface  NONLQO.  The  user  may 
override  the  default  value  by  utilizing  NONLQ  as  shown  above. 

Following  each  variable  name  in  the  list  below  appears  a  one-  or  a  two-headed  arrow  symbol 

of  the  form  and  <■ - >.  These  symbols  signify  that  the  variable  is  for  input,  output,  and 

input-output,  respectively. 

The  symbol  e  in  some  parts  of  this  section  designates  the  machine  epsilon  (see  Section  8). 

*NRM— A  positive  integer  specifying  the  row  dimension  of  the  work  array  WRKNEM  in  the  user’s 
calling  program.  NRM  must  satisfy  the  relation  NRM  >  M  +  N;  if  not,  the  program  will  abort.  The 
provision  of  this  variable,  NRN,  and  NC  (below)  allows  the  user  the  flexibility  of  solving  several 
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problems  with  different  values  of  M  and  N  one  after  the  other,  with  the  same  work  arrays. 

*NRN— :  A  positive  integer  specifying  the  row  dimension  of  the  work  array  WRKNEN  in  the  user’s 
calling  program.  NRN  must  satisfy  the  relation  NRN  >  N;  if  not,  the  program  will  abort. 

*NC— :  A  positive  integer  specifying  the  row  dimension  of  the  work  array  WRKUNC  in  the  user’s 
calling  program.  NC  must  satisfy  the  relation  NC  >  |Vn];  if  not,  the  program  will  abort. 

*X0— :  An  array  of  length  N  that  contains  an  initial  estimate  of  the  solution  x*. 

*M^>:  A  positive  integer  specifying  the  number  of  nonlinear  equations.  The  program  will  abort 
if  M  <  0. 

*N— >:  A  positive  integer  specifying  the  number  of  variables  in  the  problem.  The  program 
will  abort  if  N  <  0. 

TYPX— An  array  of  length  N  in  which  the  typical  size  of  the  components  of  X  is  specified. 
The  typical  component  sizes  should  be  positive  real  scalars.  If  a  negative  value  is  specified,  its 
absolute  value  will  be  used.  If  0.  is  specified,  1.  will  be  used.  This  vector  is  used  by  the  package 
to  determine  the  scaling  matrix,  Dx.  Although  the  package  may  work  reasonably  well  in  a  large 
number  of  instances  without  scaling,  it  may  fail  when  the  components  of  x*  are  of  radically 
different  magnitude  and  scaling  is  not  invoked.  If  the  sizes  of  the  parameters  are  known  to  differ 
by  many  orders  of  magnitude,  then  the  scale  vector  TYPX  should  definitely  be  used.  For  example, 
if  it  is  anticipated  that  the  range  of  values  for  the  iterates  Xk  would  be 

x\  G  [-1010,  1010] 

2-2  €  [-102  ,  104] 

23  €  [-6  x  1(T6,  9  x  10~6] 

then  an  appropriate  choice  would  be  TYPX  =  (1.0D10,  1.0D3,  7.0D-6).  Module  DFAULT  returns 
TYPX  =  (1.0D0,  ...,  1.0D0). 

TYPF— An  array  of  length  M  in  which  the  typical  size  of  the  components  of  F  is  specified. 
The  typical  component  sizes  should  be  positive  real  scalars.  If  a  negative  value  is  specified, 
its  absolute  value  will  be  used.  If  0.  is  specified,  1.  will  be  used.  This  vector  is  used  by  the 
package  to  determine  the  scaling  matrix  Dp.  TYPF  should  be  chosen  so  that  all  the  components 
of  Dp(x)  have  similar  typical  magnitudes  at  points  not  too  near  a  root,  and  should  be  chosen 
in  conjunction  with  FTOL.  It  is  important  to  supply  values  of  TYPF  when  the  magnitudes  of 
the  components  of  F{ x)  are  expected  to  be  very  different;  in  this  case,  the  program  may  work 
better  with  good  scaling  information  than  with  Dp  —  I.  If  the  magnitudes  of  the  components 
of  F(x)  are  similar,  the  choice  Dp  =  I  suffices.  Module  DFAULT  returns  TYPF  =  (1.0D0, ...,  1.0D0). 

ITNLIM-+:  Positive  integer  specifying  the  maximum  number  of  iterations  to  be  performed  before 
the  program  is  terminated.  Module  DFAULT  returns  ITNL-IM  =  150.  If  the  user  specifies  ITNLIM 
<  0,  the  module  INCHK  will  supply  the  value  150. 
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JACFLG— :  Integer  flag  designating  whether  or  not  an  analytic  Jacobian  has  been  supplied  by 
the  user. 

•  JACFLG  =  0  :  No  analytic  Jacobian  supplied. 

•  JACFLG  =  1  :  Analytic  Jacobian  supplied. 

When  JACFLG  =  0,  the  Jacobian  is  obtained  by  finite  differences.  The  module  D FAULT  returns 
the  value  0.  If  the  user  specifies  an  illegal  value,  the  module  INCHK  will  supply  the  value  0. 


GRADTL— *:  Positive  scalar  giving  the  tolerance  at  which  the  scaled  gradient  of  f(x)  =  \F(x)T F\x) 
is  considered  close  enough  to  zero  to  terminate  the  algorithm.  The  scaled  gradient  is  a  mea¬ 
sure  of  the  relative  change  in  F  in  each  direction  xt  divided  by  the  relative  change  in  xF  More 
precisely,  the  test  used  by  the  program  is 


max 


V/(ff)  ]t  Tnax{\  Xj  |,TYPXt] 
max {F norm ,  n/2} 


|  <  GRADTL. 


Here  V/(z)  =  J(x)T  Df2F(x),  and  Fnorm  =  \\\DfF{x)\\22  where  DF  =  diag(  1/TYPF1} l/TYPFm). 
The  module  DFAULT  returns  the  value  e1/3.  If  the  user  specifies  a  negative  value,  the  module 
INCHK  will  supply  the  value  e1/3  . 


STEPTL-*:  A  positive  scalar  providing  the  minimum  allowable  relative  step  length.  STEPTL 
should  be  at  least  as  small  as  lQ~d,  where  d  is  the  number  of  accurate  digits  the  user  desires  in 
the  solution  x*.  The  actual  test  used  is 

max  (  — ----Vr/* - 1  <  STEPTL. 

*  {  ma x { | xt‘ fc,TYPX,-|}  J  ~ 

The  program  may  terminate  prematurely  if  STEPTL  is  too  large.  Module  DFAULT  returns  the 
value  c2/3.  If  the  user  specifies  a  negative  value,  the  module  INCHK  will  supply  the  value  e2/3. 


FTOL—*:  A  positive  scalar  giving  the  tolerance  at  which  the  scaled  function  DfF(x)  is  con¬ 
sidered  close  enough  to  zero  to  terminate  the  algorithm.  The  program  is  halted  if  ||£>/r.F(z)||O0 
is  <  FTOL.  This  is  the  primary  stopping  condition  for  nonlinear  equations;  the  values  of  TYPF 
and  FTOL  should  be  chosen  so  that  this  test  reflects  the  user’s  idea  of  what  constitutes  a  solution 
to  the  problem.  The  module  DFAULT  returns  the  value  e2/3.  If  the  user  specifies  a  negative  value, 
the  module  INCHK  will  supply  the  value  e2/3. 

METHOD—*:  An  integer  flag  designating  which  method  to  use. 

•  METHOD  =  0  :  Newton  or  Gauss-Newton  algorithm  is  used. 

•  METHOD  =  1  :  Tensor  algorithm  is  used. 

Module  DFAULT  returns  value  1.  If  the  user  specifies  an  illegal  value,  module  INCHK  will  reset 
METHOD  to  1,  and  execution  will  continue. 


GLOBAL-*:  An  integer  flag  designating  which  global  strategy  to  use. 
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•  GLOBAL  =  0  :  Line  search  is  used. 

•  GLOBAL  =  1  :  Two-dimensional  trust  region  is  used. 

Module  DFAULT  returns  value  of  0.  If  the  user  specifies  an  illegal  value,  module  INCHK  will  reset 
GLOBAL  to  0,  and  execution  will  continue. 

STEPMX— :  A  positive  scalar  providing  the  maximum  allowable  scaled  step  length  ||Da.(a;+-:rc)||2, 
where  Dx  =  dm#(l/TYPXi , ...,  l/TYPXn).  STEPMX  is  used  to  prevent  steps  that  would  cause  the 
nonlinear  equations  problem  to  overflow,  and  to  prevent  the  algorithm  from  leaving  the  area  of 
interest  in  parameter  space.  STEPMX  should  be  chosen  small  enough  to  prevent  these  occurrences 
but  should  be  larger  than  any  anticipated  “reasonable”  step.  Module  DFAULT  returns  the  value 
STEPMX  =  103.  If  the  user  specifies  a  nonpositive  value,  module  INCHK  sets  STEPMX  to  103. 

DLT-*:  A  positive  scalar  giving  the  initial  trust  region  radius.  When  the  line  search  strat¬ 
egy  is  used,  this  parameter  is  ignored.  For  the  trust  region  algorithm,  if  DLT  is  supplied,  its 
value  should  reflect  what  the  user  considers  a  maximum  reasonable  scaled  step  length  at  the 
first  iteration.  If  DLT  is  not  supplied  (DLT  =  -1.0),  the  routine  uses  the  length  of  the  Cauchy  step 
at  the  initial  iterate  instead.  The  module  DFAULT  returns  the  value  -1.0.  If  the  user  specifies  a 
nonpositive  value,  module  INCHK  sets  DLT  =  -1.0. 

IPR-+:  The  unit  on  which  the  package  outputs  information.  DFAULT  returns  the  value  6,  which 
is  the  standard  Fortran  unit  for  the  printer. 

*WRKUNC— Workspace  used  by  UNCMIN.  The  user  must  declare  this  array  to  have  dimensions  at 
least  NCx(2[\/N]+4)  in  the  calling  routine;  if  not,  the  program  will  abort. 

*WRKNEM— *:  Workspace  used  to  store  the  Jacobian  matrix,  the  function  values  matrix  FV,  the 
tensor  matrix  ANLS,  and  working  vectors.  The  user  must  declare  this  array  to  have  dimensions 
at  least  NRMx(N+2|"\/N]  +  U)  in  the  calling  routine;  if  not,  the  program  will  abort. 

*WRKNEN-^:  Workspace  used  to  store  the  matrix  S  of  previous  directions,  the  matrix  SHAT  of 
linearly  independent  directions,  and  working  vectors.  The  user  must  declare  this  array  to  have 
dimensions  at  least  NRNx(2[v/fi]+9)  in  the  calling  routine;  if  not,  the  program  will  abort. 

*IWRKN— o  Workspace  used  to  store  the  integer  working  vectors.  The  user  must  declare  this 
array  to  have  dimensions  at  least  NRNx3  in  the  calling  routine;  if  not,  the  program  will  abort. 

ANJA— The  name  of  a  user-supplied  subroutine  that  evaluates  the  first  derivative  (Jacobian) 
of  the  function  F(x).  The  subroutine  must  be  declared  EXTERNAL  in  the  user’s  program  and 
must  conform  to  the  usage 

CALL  ANJACJAC,  X,  NRM,  N) , 

where  X  is  a  vector  of  length  N  and  the  2-dimensional  array  JAC  is  the  analytic  Jacobian  of  F  at 
X.  When  using  the  interface  N0NLQ,  if  no  analytic  Jacobian  is  supplied  (JACFLG  =  0),  the  user 
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must  use  the  dummy  name  FDAJA  as  the  value  of  this  parameter. 

*FVEC— :  The  name  of  a  user-supplied  subroutine  that  evaluates  the  function  F  at  an  arbi¬ 
trary  vector  X.  The  subroutine  must  be  declared  EXTERNAL  in  the  user’s  calling  program  and 
must  conform  to  the  usage 

CALL  FVEC (X ,  F,  M,  N), 

where  X  is  a  vector  of  length  N  and  F  is  a  vector  of  length  M.  The  subroutine  must  not  alter  the 
values  of  X. 

*MSG< - ■>:  An  integer  variable  that  the  user  may  set  on  input  to  inhibit  certain  automatic 

checks  or  to  override  certain  default  characteristics  of  the  package.  (In  the  short  call  it  should 
be  set  to  0  on  input.)  There  are  four  “message”  features  that  can  be  used  individually  or  in 
combination  as  discussed  below. 

•  MSG  =  0  :  Values  of  input  parameters,  final  results,  and  termination  code  are  printed. 

•  MSG  =  2  :  Do  not  check  user’s  analytic  Jacobian  routine  against  its  finite  difference  esti¬ 

mate.  This  may  be  necessary  if  the  user  knows  the  Jacobian  is  properly  coded,  but  the 
program  aborts  because  the  comparative  tolerance  is  too  tight.  Do  not  use  MSG  =  2  if  the 
analytic  Jacobian  is  not  supplied. 

•  MSG  =  4:  Suppress  printing  of  the  input  state,  the  final  results,  and  the  stopping  condition. 

•  MSG  =  8  :  Print  the  intermediate  results;  that  is,  the  input  state,  each  iteration  including 

the  current  iterate  xk ,  \\\D  FF(xk)\\22 ,  and  Vf(x)  =  J(x)T  DF2  F(x),  and  the  final  results 
including  the  stopping  conditions. 

The  user  may  specify  a  combination  of  features  by  setting  MSG  to  the  sum  of  the  indivddual 
components.  The  module  D FAULT  returns  a  value  of  0.  On  output,  if  the  program  has  terminated 
because  of  erroneous  input,  MSG  contains  an  error  code  indicating  the  reason. 

•  MSG  =  0  :  No  error. 

•  MSG  =  -1  :  Illegal  dimension  ,  NRM  <  M+N. 

•  MSG  =  -2  :  Illegal  dimension  ,  NRN  <  N. 

•  MSG  =  -3  :  Illegal  dimension  ,  NC  <  [\/N]- 

•  MSG  =  -4  :  Illegal  dimension  ,  M  <  0. 

•  MSG  =  -5  :  Illegal  dimension  ,  N  <  0. 

•  MSG  =  -6  :  Program  asked  to  override  check  of  analytic  Jacobian  against  finite  difference 
estimate,  but  routine  ANJA  not  supplied  (incompatible  input). 

•  MSG  =  -7  :  Probable  coding  error  in  the  user’s  analytic  Jacobian  routine  ANJA.  Analytic 
and  finite  difference  Jacobian  do  not  agree  within  the  assigned  tolerance. 

*XP^-:  An  array  of  length  N  containing  the  best  approximation  to  the  solution  z*  upon  return. 
(If  the  algorithm  has  not  converged,  the  final  iterate  is  returned). 

*FP^— :  An  array  of  length  M  containing  the  function  value  F(XP). 
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*GP<— :  An  array  of  length  N  containing  the  gradient  of  the  function  |||F(x)||22  at  XP. 

*TERMCDt—  :  An  integer  that  specifies  the  reason  the  algorithm  was  terminated. 

•  TERMCD  =  0  :  No  termination  criterion  satisfied  (occurs  if  package  terminates  because  of 
illegal  input). 

•  TERMCD  =  1  :  function  tolerance  reached.  Current  iteration  is  probably  solution. 

•  TERMCD  =  2  :  gradient  tolerance  reached.  For  nonlinear  least  squares,  current  iteration  is 
probably  solution;  for  nonlinear  equations,  this  could  either  be  solution  or  local  minimizer. 

•  TERMCD  =  3  :  Successive  iterates  within  tolerance.  Current  iterate  may  be  solution,  or 
algorithm  may  have  bogged  down  away  from  solution. 

•  TERMCD  =  4  :  Last  global  step  failed  to  locate  a  point  lower  than  XP.  It  is  likely  that  either 
XP  is  an  approximate  solution  of  the  problem  or  STEPTL  is  too  large. 

•  TERMCD  =  5  :  Iteration  limit  exceeded. 

7.  Summary  of  Default  Values 

The  following  parameters  are  returned  by  the  module  DFAULT: 

ITNLIM  =  150 
JACFLG  =  0 
IPR  =  6 
GRADTL  =  e1/3 
FTOL  =  e2/3 
STEPTL  =  e2/3 
METHOD  =  1 
GLOBAL  =  0 
STEPMX  =  10.0D+3 
DLT  =  -1.0D0 

TYPX  =  (1.0D0,  . . . ,  1.0D0) 

TYPF  =  (1.0D0,  . . . ,  1.0D0) 

MSG  =  0 


8.  Implementation  Details 

This  program  package  has  been  coded  in  Fortran  77  using  double  precision.  It  consists  of 
approximately  8700  lines  of  code,  of  which  3400  lines  are  the  software  package  UNCMIN  [12] 
which  has  been  designed  to  solve  the  unconstrained  nonlinear  optimization  problem,  and  about 
25%  are  comments.  The  total  data  storage  required  is  about  M  x  (N  +  2\/N)  +  N  x  (N ■+  4VN) 
double-precision  numbers.  The  program  was  developed  and  tested  on  a  Sun4  computer  in  the 
Computer  Science  Department  at  the  University  of  Colorado  at  Boulder. 

There  is  one  machine  dependency.  The  machine  epsilon  is  calculated  by  the  package  and  used 
in  several  places,  including  finite  differences  stepsizes  and  stopping  criteria.  On  some  computers, 
the  returned  value  may  be  incorrect  because  of  compiler  optimizations.  The  user  may  wish  to 
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check  the  computer  value  of  the  machine  epsilon  and,  if  it  is  incorrect,  replace  the  code  in  the 
subroutine  MACEPS  with  the  following  statement. 

EPS  =  correct  value  of  machine  epsilon 


9.  Example  of  Use 

In  the  example  code  shown  below  in  Figure  1,  we  first  call  the  default  routine  DFAULT  which 
returns  with  the  default  values,  then  override  the  values  of  GRADTL,  FTOL,  and  STEPTL.  Then  we 
call  the  interface  NLEQ  to  solve  the  system  of  nonlinear  equations  coded  in  FVEC.  We  arbitrarily 
base  our  storage  upon  NRM  =  100  and  NRN  =  30  to  allow  for  larger  problems  than  those  shown. 

PROGRAM  TENSLV. 

INTEGER  NRM , NRN , NC , M , N , ITNLIM ,  JACFLG , METHOD 
INTEGER  GLOBAL , IPR , MSG , TERMCD , I 
DOUBLE  PRECISION  GRADTL, STEPTL, FTOL, STEPMX, DLT 
PARAMETER  (NRM  =  100,  NRN  =30,  NC  =  6) 

INTEGER  IWRKN(NRN,3) 

DOUBLE  PRECISION  XO(NRN)  , WRKUNC(NC, 16) ,WRKNEM(NRM,53) 

DOUBLE  PRECISION  WRKNEN (NRN ,21) ,TYPX(NRN) ,TYPF (NRM) 

DOUBLE  PRECISION  XP(NRN) ,FP(NRM) ,GP(NRN) 

EXTERNAL  FDAJA , FVEC 

READ (5,*)  M,N 

READ(5 , *)  (X0(I) ,I=1,N) 

CALL  DF AULT  (M ,  N ,  ITNLIM ,  JACFLG ,  GRADTL ,  STEPTL ,  FTOL ,  METHOD , 

+  GLOBAL , STEPMX , DLT , TYPX , TYPF ,  IPR ,  MSG) 

GRADTL  =  1.0D-5 
FTOL  =  1.0D-9 
STEPTL  =  1.0D-9 

CALL  N0NLQ ( NRM , NRN , NC , X0  , M , N , TYPX , TYPF , ITNLIM , JACFLG , GRADTL , 

+  STEPTL , FTOL , METHOD ,  GLOBAL , STEPMX , DLT , IPR , WRKUNC , 

+  WRKNEM ,  WRKNEN ,  IWRKN ,  FDAJA ,  FVEC , MSG ,  XP  ,  FP ,  GP ,  TERMCD ) 

END 

Figure  1.  Driver  to  solve  a  system  of  nonlinear  equations  or  a  nonlinear 
least  squares  problem 
C 

C  The  following  is  a  subroutine  for  the  Rosenbrock  function. 

C 

SUBROUTINE  FVEC(X,F,M,N) 

INTEGER  N,M 

DOUBLE  PRECISION  X(N) ,F(M) 

F ( 1)  =  10 . 0D0* (X(2)-X(l)**2) 

F (2)  =  1.0D0-X(1) 

RETURN 

END 
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If  we  run 
M,  N:  2 
XO:  -1. 


NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 

NESLV 


RESULT 

RESULT 

RESULT 

RESULT 

RESULT 

RESULT 

RESULT 

NLSTP 

RESULT 

RESULT 

RESULT 

RESULT 

RESULT 

RESULT 

RESULT 


the  above  example  with  the  following  input: 
2 

2D0  1.0D0  the  output  will  be  as  follows: 


TYPICAL  X 

0 . 1000000000000D+01  0 . 1000000000000D+01 

DIAGONAL  SCALING  MATRIX  FOR  X 
0 . 1000000000000D+01  0 . 1000000000000D+01 

TYPICAL  F 

0 . 1000000000000D+01  0 . 1000000000000D+01 

DIAGONAL  SCALING  MATRIX  FOR  F 
0 . lOOOOOOOOOOOOD+Ol  0 . 1000000000000D+01 


JACOBIAN  FLAG 
METHOD 

GLOBAL  STRATEGY 
ITERATION  LIMIT 
MACHINE  EPSILON 
STEP  TOLERANCE 
GRADIENT  TOLERANCE 
FUNCTION  TOLERANCE 
MAXIMUM  STEP  SIZE 
TRUST  REG  RADIUS 


=0  (=1  IF  ANALYTIC  JACOBIAN  SUPPLIED) 
=1  (=1  IF  TENSOR  METHOD  USED) 

=0  (=0  IF  LINE  SEARCH  USED) 

=  150 

=  0 . 2220446049250D- 15 
=  0 . 1000000000000D-08 
=  0 . 1000000000000D-04 
=  0 . 1000000000000D-08 
=  0 . 1000000000000D+04 
=-0.1000000000000D+01 


ITERATION  K  =  0 

X(K) 

-0 . 1200000000000D+01  0 . lOOOOOOOOOOOOD+Ol 

FUNCTION  AT  X(K) 

0 . 1210000000000D+02 
GRADIENT  AT  X(K) 

-0 . 1077999998579D+03  -0 . 4400000000000D+02 

FUNCTION  VALUE  CLOSE  TO  ZERO 

ITERATION  K  =  7 

X(K) 

0 . 9999999997 177D+00  0 . 9999999994362D+00 

FUNCTION  AT  X(K) 

0 .398688 1344063D- 19 
GRADIENT  AT  X(K) 

-0 . 4270366 293595D -09  0 .7237543951455D-10 
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If  we  now  wish  to  solve  the  nonlinear  least  squares  problem  given  by  the  following  subroutine 
FVEC: 

C 

C  The  following  is  a  subroutine  for  the  Wood  function. 

C 

SUBROUTINE  FVEC(X,F ,M,N) 

INTEGER  M , N 

DOUBLE  PRECISION  X(N),F(M) 

F ( 1 )  =  10 . 0D0* (X(2) -X( 1) **2) 

F (2)  =  l.ODO-X(l) 

F (3)  =  SQRT(90.0DO)*(X(4)-X(3)**2) 

F (4)  =  1 . 0D0-X(3) 

F (5)  =  SQRT(10 .0D0)*(X(2)+X(4)-2.0D0) 

F (6)  =  (1.0D0/SQRT(10.0D0))*(X(2)-X(4)) 

RETURN 

END 

with  the  following  input: 

M,  N:  6  4 

XO:  -30.0D0  -10.0D0  -30.0D0  -10.0D0 

GLOBAL:  1  (i.e.  this  is  set  after  the  call  to  DFAULT  in  the  driver  program) 
the  output  will  be  as  follow’s: 


NESLV 

TYPICAL  X 

NESLV 

0. 1000000000000D+01 

0. 1000000000000D+01 

0.1000000000000D+01 

NESLV 

0. 1000000000000D+01 

NESLV 

DIAGONAL  SCALING  MATRIX  FOR  X 

NESLV 

0. 1000000000000D+01 

0. 1000000000000D+01 

0.1000000000000D+01 

NESLV 

0. 1000000000000D+01 

NESLV 

TYPICAL  F 

NESLV 

0. 1000000000000D+01 

0, 1000000000000D+01 

0.100000000000QD+01 

NESLV 

0. 1000000000000D+01 

0. 1000000000000D+01 

0.1000000000000D+01 

NESLV 

NESLV 

DIAGONAL  SCALING  MATRIX  FOR  F 

NESLV 

0. 1000000000000D+01 

0. 1000000000000D+01 

0.1000000000000D+01 

NESLV 

0. 1000000000000D+01 

0.1000000000000D+01 

0.1000000000000D+01 

NESLV 

NESLV 

JACOBIAN  FLAG  =0 

(=1  IF  ANALYTIC  JACOBIAN  SUPPLIED) 

NESLV 

METHOD  =1 

(=1  IF  TENSOR  METHOD  USED) 

NESLV 

GLOBAL  STRATEGY  =1 

(=0  IF  LINE  SEARCH  USED) 

NESLV 

ITERATION  LIMIT 

150 

NESLV 

MACHINE  EPSILON  =  0 

. 2220446049250D-15 

NESLV 

STEP  TOLERANCE  =  0 

. 1000000000000D-08 
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NESLV  GRADIENT  TOLERANCE  =  0 . 1000000000000D-04 
NESLV  FUNCTION  TOLERANCE  =  0 . 1000000000000D-08 
NESLV  MAXIMUM  STEP  SIZE  =  0 . 1000000000000D+04 
NESLV  TRUST  REG  RADIUS  =-0 . lOOOOOOOOOOOOD+Ol 


RESULT  ITERATION  K  =  0 

RESULT  X (K) 

RESULT  -0 . 3000000000000D+02  -0 . 1000000000000D+02  -0 . 3000000000000D+02 

RESULT  -0. 1000000000000D+02 

RESULT  FUNCTION  AT  X(K) 

RESULT  0 . 7867288 100000D+08 

RESULT  GRADIENT  AT  X(K) 

RESULT  -0 . 5460030962972D+07  -0 . 9122000000267D+05  -0 .4914030950915D+07 

RESULT  -0 . 8211996230035D+05 

NLSTP  FUNCTION  VALUE  CLOSE  TO  ZERO 

RESULT  ITERATION  K  =  5 

RESULT  X(K) 

RESULT  0. lOOOOOOOOOOOOD+Ol  0 . lOOOOOOOOOOOOD+Ol  0 . lOOOOOOOOOOOOD+Ol 

RESULT  0 . 1000000000000D+0 1 

RESULT  FUNCTION  AT  X(K) 

RESULT  0 .248886170265  ID -26 

RESULT  GRADIENT  AT  X(K) 

RESULT  0. 11786 12771710D-11  -0 . 6152522979420D-12  0 .7212008821566D-12 

RESULT  -0 . 3861688746554D-12 


10.  Test  Results 

We  have  tested  the  TENSOLVE  software  package  using  the  algorithms  described  above  on  a  variety 

of  nonsingular  and  singular  problems.  This  section  summarizes  and  discusses  the  test  results. 

In  our  tests,  the  package  terminates  successfully  if  the  relative  size  of  ( x+  —  xc )  is  less 
1  2 

than  macheps 2,  or  ||  F(x+ )  ||t30  is  less  than  macheps^ .  It  terminates  unsuccessfully  if  the 
iteration  limit  of  150  is  exceeded.  If  the  last  global  step  fails  to  locate  a  point  lower  than  xc 
in  the  line  search  or  trust  region  global  strategies,  or  the  relative  size  of  J(x+)T F(x+)  is  less 
than  macheps 3,  the  method  stops  and  reports  this  condition;  this  may  indicate  either  success 
or  failure.  All  our  computations  were  performed  on  a  Sun4  computer  in  the  Computer  Science 
Department  at  the  University  of  Colorado  at  Boulder,  using  double-precision  arithmetic. 

First  we  tested  the  software  package  on  the  set  of  nonlinear  equations  and  nonlinear  least 
squares  problems  in  More,  Garbow,  and  Hillstrom  [9].  These  problems  all  have  nonsingular 
.Jacobians  at  the  solution  with  the  exception  of  Powell’s  singular  function.  Then  we  created 
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singular  test  problems  as  proposed  in  Schnabel  and  Frank  [11]  by  modifying  the  nonsingular 
test  problems  of  More,  Gar  bow,  and  Hillstrom  to  the  form 

F(x)  =  F(x)  -  F'(z*)A(AT A)~l AT(x  -  xx),  (10.1) 

where  F(x)  is  the  standard  nonsingular  test  function,  xx  is  its  root,  or  minimizer,  and  A  6  Rmxk 
has  full  column  rank  with  1  <  k  <  n.  Note  that  x*  is  a  root  or  critical  point  of  the  modified 
problem,  and  rank  Ff(x*)  =  n-rank(A).  We  used  (10.1)  to  create  two  sets  of  singular  problems, 
with  F'(x)  having  rank  n  —  1  and  n  -  2,  respectively,  by  using 

A€Jlmx\.  At  —  (1,1, 1), 

and 

AeRmX 2,  AT  = 

respectively. 

We  tested  our  tensor  algorithm  on  17  test  functions  for  systems  of  nonlinear  equations  (also 
including  4  functions  from  [7]  whose  Jacobian  at  the  solution  z*  is  singular  and  are  designated  as 
Griewank  functions)  and  11  test  functions  for  nonlinear  least  squares.  Some  of  the  test  problems 
were  run  at  various  dimensions.  All  of  these  problems  were  also  run  with  the  standard  method. 
The  list  of  test  problems  is  given  in  Appendix  A;  the  detailed  test  results  are  given  in  [2]. 

Our  computational  results  for  the  test  problems  whose  Jacobians  at  the  solution  have  ranks 
n,  n-1,  and  n-  2  are  summarized  in  Tables  1  to  4.  In  each  of  these  tables,  columns  “Better”  and 
“Worse”  represent  the  number  of  times  the  tensor  method  was  better  and  worse,  respectively, 
than  the  standard  method  by  more  than  one  iteration.  The  “Tie”  column  represents  the  number 
of  times  the  tensor  and  standard  methods  required  within  one  iteration  of  each  other.  For  each 
set  of  problems,  we  summarize  the  comparative  costs  of  the  tensor  and  standard  methods  using 
average  ratios  of  two  measures:  iterations,  and  function  evaluations.  The  average  iteration 
ratio  is  the  total  number  of  iterations  required  by  the  tensor  method  over  all  the  problems 
included,  divided  by  the  total  number  of  iterations  required  by  the  standard  method  on  the  same 
problems.  The  same  measure  is  used  for  the  average  function  evaluation  ratio.  These  average 
ratios  include  only  problems  that  were  successfully  solved  by  both  methods.  We  have  excluded 
from  the  summary  of  statistics  all  cases  where  the  tensor  and  standard  methods  converge  to 
a  different  root,  or  to  the  same  root  as  each  other  but  not  the  singular  root  x%  in  the  case 
of  singular  problems.  However,  the  statistics  for  the  “Better,”  “Worse,”  and  “Tie”  columns 
include  the  cases  where  only  one  of  the  two  methods  converges,  and  exclude  the  cases  where 
both  methods  do  not  converge.  The  total  number  of  problems  that  were  solved  by  one  method 
but  not  the  other  are  given  in  the  last  two  columns  of  each  table. 

In  the  test  results  obtained  for  both  nonsingular  and  singular  nonlinear  equations  problems, 
the  tensor  method  virtually  never  is  less  efficient  than  the  standard  method  and  usually  is 
more  efficient.  The  improvement  by  the  tensor  method  over  the  standard  method  with  the  same 
global  strategy  is  substantial,  averaging  about  49%  in  iterations  and  41%  in  function  evaluations 
when  the  line  search  is  used,  and  about  42%  in  iterations  and  31%  in  function  evaluations  when 
the  trust  region  is  used,  on  the  problems  that  are  successfully  solved  by  both  methods.  The 
improvement  by  the  tensor  method  over  the  standard  method  is  more  dramatic  on  problems 


1  1  1 

1  -1  1 


1 

±1 


(10.2) 
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with  small  rank  deficiency  than  on  nonsingular  problems,  but  is  substantial  in  all  cases.  On  rank 
n  —  1  problems,  this  is  due  in  part  to  the  tensor  methods  achieving  3  step  Q-order  |  convergence 
whereas  the  Newton’s  method  is  linearly  convergent  with  constant  |  [6]. 

The  tensor  method  is  also  significantly  more  robust  than  the  standard  Newton-based  method 
for  the  nonlinear  equations  test  set.  Over  all  the  nonlinear  equations  test  problems,  5  rank  n 
problems,  5  rank  n  —  1  problems,  and  8  rank  n  —  2  problems  were  solved  by  the  tensor  method 
and  not  by  the  standard  method  when  the  line  search  was  used,  and  6  rank  n  problems,  4  rank 
n  —  1  problems,  and  4  rank  n  -  2  problems  were  solved  by  the  tensor  method  and  not  by  the 
standard  method  when  the  trust  region  was  used.  On  the  other  hand,  only  1  rank  n  problem 
was  solved  by  the  standard  method  and  not  by  the  tensor  method  when  the  line  search  was  used, 
and  only  1  rank  n  problem  was  solved  by  the  standard  method  and  not  by  the  tensor  method 
when  the  trust  region  was  used. 

For  the  entire  set  of  nonsingular  and  singular  nonlinear  least  squares  problems,  the  average 
improvement  of  the  tensor  method  over  the  standard  Gauss-Newton  method  also  is  substantial. 
Over  the  problems  solved  successfully  by  both  methods,  the  improvement  averages  about  52% 
in  iterations  and  53%  in  function  evaluations  when  the  line  search  is  used,  and  about  35%  in 
iterations  and  28%  in  function  evaluations  when  the  trust  region  is  used. 

The  tensor  method  is  also  considerably  more  robust  than  the  Gauss-Newton  method  for  the 
nonlinear  least  squares  test  set,  especially  in  the  line  search  comparison.  The  tensor  method 
solves  several  problems  that  the  standard  Gauss-Newton  method  does  not,  and  the  reverse 
never  occurs.  Over  all  the  nonlinear  least  squares  test  problems,  4  rank  n  problems,  2  rank 
72  —  1  problems,  and  4  rank  n  —  2  problems  were  solved  by  the  tensor  method  and  not  by  the 
standard  Gauss-Newton  method  when  the  line  search  was  used,  and  3  rank  n  problems,  1  rank 
72—1  problems,  and  1  rank  n  -  2  problems  were  solved  by  the  tensor  method  and  not  by  the 
standard  Gauss-Newton  method  when  the  trust  region  was  used.  On  the  other  hand,  there  were 
no  problems  solved  by  the  standard  Gauss-Newton  method  and  not  by  the  tensor  method  when 
either  the  line  search  or  the  trust  region  was  used. 

A  closer  examination  of  the  nonlinear  least  squares  test  results  shows  that  the  improvements 
by  the  tensor  method  are  considerably  larger  for  zero  residual  problems  than  for  nonzero  residual 
problems.  The  difference  is  most  dramatic  in  the  nonsingular  case.  Tables  5  and  6  show  the 
average  iteration  and  function  evaluation  ratios  of  the  tensor  method  versus  the  Gauss-Newton 
method  for  zero  and  nonzero  residual  problems,  respectively.  The  performance  differences  may 
be  attributable  to  the  fact  that  both  the  standard  and  tensor  methods  are  linearly  convergent 
on  nonzero  residual  problems,  but  are  more  quickly  convergent  on  zero  residual  problems. 

The  comparison  between  the  line  search  methods  and  the  trust  region  methods  is  very 
interesting,  for  both  the  standard  and  tensor  methods.  This  is  summarized  in  Tables  7  and  8. 
These  tables  show  that  on  the  average,  the  two-dimensional  trust  region  approach  often  is  more 
efficient  than  the  line  search  method,  especially  on  nonsingular  problems.  It  is  important  to 
note,  however,  that  the  line  search  method  is  simpler  to  implement  and  to  understand  than  the 
the  two-dimensional  trust  region  approach,  and  is  appreciably  faster  in  terms  of  CPU  time  on 
small,  inexpensive  problems  where  the  complexity  of  the  code  becomes  the  dominant  cost.  It 
should  also  be  noted  that  there  is  considerable  variation  in  the  comparative  efficiency  of  the  line 
search  and  trust  region  methods  on  individual  problems  and  that  either  may  be  more  efficient 
for  a  particular  problem  class. 
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Perhaps  a  more  important  consideration  in  the  general  comparison  of  the  line  search  and 
trust  region  methods,  however,  is  that  the  two-dimensional  trust  region  method  solves  consid¬ 
erably  more  of  the  test  problems  than  the  line  search  method.  The  advantage  in  robustness  is 
particularly  large  in  comparing  line  search  and  trust  region  versions  of  the  standard  methods; 
it  is  smaller  but  still  significant  in  comparing  tensor  methods  for  nonlinear  least  squares,  and 
insignificant  in  our  tests  of  tensor  methods  for  nonlinear  equations.  Over  all  the  nonlinear  equa¬ 
tions  test  problems,  1  rank  n  —  1  and  2  rank  n  —  2  problems  were  solved  by  the  trust  region  and 
not  by  the  line  search,  whereas  1  rank  n  problem  and  1  rank  n  —  1  problem  were  solved  by  the 
line  search  and  not  by  the  trust  region,  when  the  tensor  method  was  used.  On  the  other  hand, 
when  the  Newton’s  method  based  code  was  used,  6  rank  n  problems,  6  rank  n  -  1  problems, 
and  5  rank  n  -  2  problems  were  solved  by  the  trust  region  and  not  by  the  line  search,  whereas 
only  1  rank  n  problem,  1  rank  n  -  1  problem,  and  1  rank  n  -  2  problem  were  solved  by  the  line 
search  and  not  by  the  trust  region.  Over  all  the  nonlinear  least  squares  test  problems,  7  rank  n 
problems  and  3  rank  n  -  2  problems  were  solved  by  the  trust  region  and  not  by  the  line  search, 
whereas  4  rank  n  problems  were  solved  by  the  line  search  and  not  by  the  trust  region,  when 
the  tensor  method  was  used.  When  the  standard  Gauss-Newton  method  was  used,  7  rank  n 
problems,  2  rank  n  -  1  problems,  and  8  rank  n-  2  problems  were  solved  by  the  trust  region  and 
not  by  the  line  search,  whereas  only  1  rank  n  -  1  problem  and  1  rank  n  -  2  problem  were  solved 
by  the  line  search  and  not  by  the  trust  region.  Thus,  the  trust  region  version  seems  to  have  a 
considerable  advantage  over  the  line  search  version  in  its  robustness,  although  more  when  using 
the  standard  method  than  the  tensor  method.  We  note  that  the  smaller  average  improvement  of 
the  tensor  method  over  the  standard  method  in  the  trust  region  cases  (Tables  2  and  4)  than  the 
line  search  cases  (Tables  1  and  3)  is  related  to  the  difference  in  problem  sets  that  are  included  in 
these  statistics,  because  of  the  differing  robustness  of  the  fine  search  and  trust  region  methods. 

Finally,  we  compared  our  tensor  method  with  the  NL2S0L  package  [5]  on  the  set  of  nonlinear 
least  squares  problems  used  in  [5]  that  is  fisted  in  Appendix  B.  The  reason  we  were  interested 
in  making  this  comparison  is  that  the  NL2S0L  method  theoretically  is  superlinearly  convergent 
on  nonzero  residual  problems  ([5]),  whereas  the  tensor  method  of  this  paper,  like  Gauss-Newton 
methods,  is  only  linearly  convergent  on  nonzero  residual  problems.  (This  difference  is  related  to 
NL2SQL  using  a  quadratic  model  of  F{x)TF(x)  whereas  the  tensor  and  Gauss-Newton  methods 
use  models  of  F(x).)  The  problems  include  a  mixture  of  zero,  small,  and  large  residual  problems. 

Table  9  reports  the  comparative  test  results  of  the  tensor  method  versus  NL2S0L  on  this  test 
set.  The  first  row  of  Table  9  compares  the  tensor  method. using  a  fine  search  with  NL2S0L,  whereas 
the  second  row  compares  the  tensor  method  using  a  two-dimensional  trust  region  with  NL2S0L. 
(NL2S0L  uses  a  trust  region  global  strategy.)  The  table  shows  that  on  these  test  problems, 
the  tensor  method  on  the  average  is  somewhat  more  efficient  than  NL2S0L,  with  an  average 
improvement  of  about  58%  in  iterations  and  29%  in  function  evaluations  when  the  line  search  is 
used,  and  about  24%  in  iterations  and  7%  in  function  evaluations  when  the  trust  region  is  used. 
(Note  that  the  tensor  method  with  fine  search  is  more  efficient  than  the  tensor  method  with 
trust  region  on  this  test  set.)  There  is  no  difference  in  the  robustness  of  the  two  packages  of 
this  test  set;  only  1  problem  in  the  test  set  was  solved  by  NL2S0L  and  not  by  the  tensor  method 
using  either  a  line  search  or  a  trust  region  method,  and  only  1  problem  was  solved  by  the  tensor 
method  and  not  by  NL2S0L.  These  limited  results  indicate  that  the  tensor  method  appears  to 
be  quite  competitive  with  NL2S0L  for  solving  least  squares  problems. 
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Table  1:  Summary  for  Nonlinear  Equations  Test  Problems  using  Line  Search 


Rank 

F'(x.) 

Tensor 

Average  Ratio 
Tensor/  Newton 

Only 

Newton 

Solved 

Only 

Tensor 

Solved 

Better 

Worse 

Tie 

Itn 

Feval 

n 

25 

2 

13 

0.60 

0.69 

1 

5 

n  -  1 

24 

0 

8 

0.48 

0.53 

0 

5 

n  —  2 

27 

1 

5 

0.46 

0.56 

0 

8 

Table  2:  Summary  for  Nonlinear  Equations  Test  Problems  Using  Two-Dimensional  Trust  Region 


Rank 

F'(x.) 

Tensor 

Average  Ratio 
Tensor/ Newton 

Only 

Newton 

Solved 

Only 

Tensor 

Solved 

Better 

Worse 

Tie 

Itn 

Feval 

mm 

26 

3 

13 

0.61 

0.72 

1 

6 

1 

24 

1 

9 

0.49 

0.63 

0 

4 

1 

26 

1 

5 

0.64 

0.73 

0 

4 

Table  3:  Summary  for  Nonlinear  Least  Squares  Test  Problems  Using  Line  Search 


Rank 

F'(x*) 

Tensor 

Average  Ratio 
Tensor/Gauss-Newton 

Only 

Gauss-Newton 

Solved 

Only 

Tensor 

Solved 

Better 

Worse 

Tie 

Itn 

Feval 

n 

20 

1 

8 

0.52 

0.51 

0 

4 

n  —  1 

18 

0 

8 

0.45 

0.41 

0 

2 

n  —  2 

28 

0 

5 

0.48 

0.48 

0 

4 

Table  4:  Summary  for  Nonlinear  Least  Squares  Test  Problems  Using  Two-Dimensional  Trust 
Region _ _ _ 


Rank 

F’(x,) 

Tensor 

Average  Ratio 
Tensor/Gauss-Newton 

Only 

Gauss-Newton 

Solved 

Only 

Tensor 

Solved 

Better 

Worse 

Tie 

Itn 

Feval 

n 

26 

1 

5 

0.66 

0.76 

0 

3 

n  —  1 

19 

2 

5 

0.66 

0.71 

0 

1 

n  —  2 

28 

1 

4 

0.63 

0.69 

0 

1 

26 


Table  5:  Average  Ratios  of  the  Tensor  Method  versus  the  Gauss-Newton  Method  on  Zero 
Residual  Problems  for  Line  Search  and  Trust  Region _ 


Rank 

Line  Search 

Trust  Region 

F\x.) 

Itn 

Feval 

Itn 

Feval 

n 

0.43 

0.44 

0.43 

0.56 

n  -  1 

0.41 

0.37 

0.64 

0.62 

n  —  2 

0.48 

0.48 

0.51 

0.57 

Table  6:  Average  Ratios  of  the  Tensor  Method  versus  the  Gauss-Newton  Method  on  Nonzero 
Residual  Problems  for  Line  Search  and  Trust  Region _ 


Rank 

Line  Search 

Trust  Region 

F'(x*) 

Itn 

Feval 

Itn 

Feval 

n 

0.64 

0.64 

0.78 

0.88 

n  —  1 

0.48 

0.45 

0.67 

0.79 

n  —  2 

0.49 

0.48 

0.68 

0.76 

Table  7:  Average  Ratios  of  Iterations  and  Function  Evaluations  of  Newton  with  Trust  Region 
versus  Newton  with  Line  Search  and  Tensor  with  Trust  Region  versus  Tensor  with  Line  Search 
for  Nonlinear  Equations  _ 


Rank 

Newton  TR/LS 

Tensor  TR/LS 

F’(x.) 

Itn 

Feval 

Itn 

Feval 

n 

0.80 

0.84 

0.70 

0.57 

n  —  1 

0.78 

0.89 

0.96 

0.93 

n  —  2 

0.86 

0.93 

0.92 

0.94 

Table  8:  Average  Ratios  of  Iterations  and  Function  Evaluations  of  Gauss-Newton  with  Trust 
Region  versus  Gauss-Newton  with  Line  Search  and  Tensor  with  Trust  Region  versus  Tensor  with 
Line  Search  for  Nonlinear  Least  Squares  Problems _ 


Rank 

Gauss-Newton  TR/LS 

Tensor  TR/LS 

F'(x.) 

Itn 

Feval 

Itn 

Feval 

n 

0.70 

0.65 

0.75 

0.76 

71  —  1 

0.72 

0.71 

1.05 

1.09 

n  —  2 

1.01 

0.97 

0.74 

0.80 

27 


Table  9:  Comparison  of  Tensor  Method  with  NL2S0L  on  the  Nonlinear  Equations  and  Nonlinear 
Least  Squares  Problems  Listed  in  Table  B-l 


Global  strategy 

Tensor  versus  NL2S0L 

Average  Ratio-Tensor/NL2S0L 

Better 

Worse 

Tie 

Itn 

Feval 

Tensor  w/  LS 

25 

8 

2 

0.42 

0.71 

Tensor  w /  TR 

24 

9 

2 

0.76 

0.93 
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Appendix  A 

The  columns  in  Tables  A-l  and  B-l  have  the  following  meanings: 

-  Problem:  name  of  the  problem. 

-  m:  number  of  equations. 

-  n:  number  of  variables. 

-  NS:  dimension  of  nullspace  for  Griewank’s  singular  functions. 

-  OS:  order  of  singularity  for  Griewank’s  singular  functions. 


Table  A-l:  List  of  Nonlinear  Equations  and  Nonlinear  Least  Squares  Test  Problems  Used  in  the 
Comparison  of  Tensor  Method  versus  Standard  Method _ 


Problem 

Dimension 

m 

n 

Brown  almost  linear 

10 

10 

Broyden  banded 

30 

30 

Broyden  tridiagonal 

30 

30 

Chebyquad 

7 

7 

Discrete  boundary 

30 

30 

Discrete  integral 

10 

10 

Helical  valley 

3 

3 

Powell  singular 

4 

4 

Rosenbrock 

2 

2 

Trigonometric 

30 

30 

Variable  dimension 

10 

10 

Watson 

31 

31 

Wood  gradient 

4 

4 

NS  =  1  OS  =  1 

3 

3 

NS  =  2  OS  =  1 

3 

3 

NS  =  1  OS  =  2 

3 

3 

NS  =  2  OS  =  2 

3 

3 
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Table  A-l:  List  of  Nonlinear  Equations  and  Nonlinear  Least  Squares  Test  Problems  Used  in  the 
Comparison  of  Tensor  Method  versus  Standard  Method  (continued) 


Problem 

Dimension 

m 

n 

Wood 

6 

4 

Variable  dimension 

12 

10 

Bard 

15 

3 

Beale 

3 

2 

Kowalik 

11 

4 

Penalty  1 

11 

10 

Penalty2 

10 

5 

Brown  badly  scaled 

3 

2 

Gauss  function 

15 

3 

Brown  and  Dennis 

10 

4 

Chebyquad 

8 

4 

Chebyquad 

12 

4 

Chebyquad 

16 

4 

Appendix  B 


Table  B-l:  List  of  Nonlinear  Equations  and  Nonlinear  Least  Squares  Test  Problems  Used  in  the 
Comparison  of  Tensor  Method  versus  NL2S0L _ 


Problem 

Dimension 

m 

n 

Rosenbrock 

O 

L 

2 

Helical  Valley 

3 

3 

Powell  Singular 

4 

4 

Wood 

6 

4 

Beale 

3 

2 

Box  three-dimensional 

10 

3 

Freudenstein  and  Roth 

2 

2 
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Table  B-l:  List  of  Nonlinear  Equations  and  Nonlinear  Least  Squares  Test  Problems  Used  in  the 
Comparison  of  Tensor  Method  versus  NL2S0L  (continued) 


Problem 

Dimension 

m 

n 

Watson 

31 

6 

Watson 

31 

9 

Watson 

31 

12 

Watson 

31 

20 

Chebyquad 

8 

8 

Bard 

15 

3 

Jennrich  and  Sampson 

10 

2 

Kowalik 

11 

4 

Osborne  1 

33 

5 

Osborne  2 

65 

11 
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